Biostatistics For Dummies (Monika Wahi John Pezzullo)

Chapter 16

Getting Straight Talk on Straight-Line Regression

IN THIS CHAPTER

Determining when to use straight-line regression

Running a straight-line regression and making sense of the output

Examining results for issues and problems

Estimating needed sample size for straight-line regression

Chapter 15 refers to regression analyses in a general way. This chapter focuses on the simplest type of regression analysis: straight-line regression. You can visualize it as fitting a straight line to the points in a scatter plot from a set of data involving just two variables. Those two variables are generally referred to as X and Y. The X variable is formally called the independent variable (or the predictor or cause). The Y variable is called the dependent variable (or the outcome or effect).
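To make the idea of fitting a straight line to X and Y points concrete, here is a minimal sketch in plain Python. It assumes the usual least-squares criterion for choosing the line, which this introduction does not spell out. The data values and the function name `fit_straight_line` are made up for illustration.

```python
# Hypothetical data: x is the independent variable, y is the dependent variable.
x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [2.1, 3.9, 6.2, 7.8, 10.1]


def fit_straight_line(x, y):
    """Return (intercept, slope) of the least-squares line Y = a + b*X."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    # Sum of cross-products and sum of squares around the means.
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return intercept, slope


intercept, slope = fit_straight_line(x, y)
print(f"Y = {intercept:.3f} + {slope:.3f} * X")  # prints: Y = 0.050 + 1.990 * X
```

The slope says how much Y changes, on average, for each one-unit increase in X, and the intercept is the fitted value of Y when X is zero.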

Knowing When to Use Straight-Line Regression

You may see straight-line regression referred to in books and articles by several different names, including linear regression, simple linear regression, linear univariate regression, and linear bivariate regression. This abundance of names can be confusing, so we always use the term straight-line regression.

Straight-line regression is appropriate when all of these things are true:

You’re interested in the relationship between two, and only two, numerical variables. At least one of them must be a continuous variable that serves as the dependent variable (Y).

You’ve made a scatter plot of the two variables and the data points seem to lie, more or less, along a straight line (as shown in Figures 16-1a and 16-1b). You shouldn’t try to fit a straight line to data that appears to lie along a curved line (as shown in Figures 16-1c and 16-1d).

The data points appear to scatter randomly around the straight line over the entire range of the chart, with no extreme outliers (as shown in Figures 16-1a and 16-1b).
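The conditions above are meant to be judged by eye from a scatter plot. As a rough numerical companion, the sketch below fits a least-squares line and then looks at the residuals (the vertical distances from each point to the line). None of this comes from the chapter itself: the curvature score (the correlation between the residuals and the squared, centered X values), the median-absolute-deviation rule for outliers, the cutoff of 3, the function names, and the data are all assumptions chosen for illustration. Treat it as a screening aid, not a replacement for looking at the plot.

```python
import math
from statistics import median


def fit_straight_line(x, y):
    """Return (intercept, slope) of the least-squares line Y = a + b*X."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return mean_y - slope * mean_x, slope


def correlation(u, v):
    """Pearson correlation; returns 0.0 if either list has no variation."""
    n = len(u)
    mean_u, mean_v = sum(u) / n, sum(v) / n
    suv = sum((a - mean_u) * (b - mean_v) for a, b in zip(u, v))
    suu = sum((a - mean_u) ** 2 for a in u)
    svv = sum((b - mean_v) ** 2 for b in v)
    denom = math.sqrt(suu * svv)
    return suv / denom if denom > 0 else 0.0


def screen_straight_line(x, y, outlier_cutoff=3.0):
    """Rough, assumption-laden screen for curvature and extreme outliers."""
    intercept, slope = fit_straight_line(x, y)
    resid = [yi - (intercept + slope * xi) for xi, yi in zip(x, y)]

    # Curvature score: near 0 when residuals scatter randomly,
    # near +1 or -1 when they follow a bowl or dome shape.
    mean_x = sum(x) / len(x)
    curvature = correlation(resid, [(xi - mean_x) ** 2 for xi in x])

    # Outliers: residuals far from the median residual, measured in
    # robust standard-deviation units (1.4826 * median absolute deviation).
    center = median(resid)
    scale = 1.4826 * median(abs(r - center) for r in resid)
    outliers = [
        i for i, r in enumerate(resid)
        if scale > 0 and abs(r - center) > outlier_cutoff * scale
    ]
    return {"curvature": curvature, "outliers": outliers}


# Hypothetical data sets for illustration.
x = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]
noise = [0.3, -0.2, 0.1, -0.4, 0.2, -0.1, 0.4, -0.3, 0.1, -0.1]

y_straight = [2 * xi + e for xi, e in zip(x, noise)]  # roughly linear
y_curved = [xi ** 2 for xi in x]                      # clearly curved
y_outlier = list(y_straight)
y_outlier[4] += 15                                    # one wild point

straight_check = screen_straight_line(x, y_straight)
curved_check = screen_straight_line(x, y_curved)
outlier_check = screen_straight_line(x, y_outlier)

print(straight_check)
print(curved_check)
print(outlier_check)
```

With these made-up numbers, the roughly linear data give a curvature score close to zero and no flagged points, the curved data give a score close to 1, and the third data set flags the point at index 4. A robust scale is used for the outlier rule because a single wild point inflates the ordinary standard deviation of the residuals and can hide itself in a small sample.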